
[PT2E] Fix per-tensor observer issue with varying shape & rank #2177

Open · Xia-Weiwen wants to merge 3 commits into main from fix_per_tensor_quant

Conversation

@Xia-Weiwen (Collaborator)

Fixes #2094 and #2112
Inputs may have varying shapes and ranks, e.g. when running Resnet18. The current implementation relies on a fixed block_size, which is not enough for such cases. The fix is simple: use block_size = -1 for each dimension for per-tensor quantization, and update block_size for each input when inserting q/dq in convert.
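A minimal standalone sketch of this idea (an editorial illustration, not the PR's actual code; `resolve_block_size` is a hypothetical name, whereas the PR calls torchao's get_block_size inside convert):

```python
# A minimal standalone sketch of the fix described above, not the PR's actual
# code. `resolve_block_size` is a hypothetical name; the PR itself calls
# torchao's get_block_size inside convert().
from typing import Sequence, Tuple


def resolve_block_size(
    block_size: Sequence[int], input_shape: Sequence[int]
) -> Tuple[int, ...]:
    """Turn an all -1 per-tensor placeholder into a concrete block size."""
    if all(b == -1 for b in block_size):
        # Per-tensor quantization: one block covers the whole tensor,
        # whatever rank the current input happens to have.
        return tuple(input_shape)
    # Other granularities: the stored block size must match the input rank.
    assert len(block_size) == len(input_shape)
    return tuple(d if b == -1 else b for b, d in zip(block_size, input_shape))


# Observer calibrated on a 4-D activation ...
print(resolve_block_size((-1, -1, -1, -1), (1, 64, 56, 56)))  # (1, 64, 56, 56)
# ... later sees a 2-D input (rank changed, e.g. the input to Resnet18's fc).
print(resolve_block_size((-1, -1, -1, -1), (1, 512)))         # (1, 512)
```

With the placeholder, the observer does not need to assume a fixed rank at calibration time; the concrete shape is only needed when the q/dq nodes are inserted in convert.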


pytorch-bot bot commented May 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2177

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b03b1e6 with merge base 07ca637:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label May 6, 2025
@Xia-Weiwen added the topic: not user facing label May 6, 2025
@Xia-Weiwen changed the title from "[PT2E] Fix per-tensor observer issue with varing shape & rank" to "[PT2E] Fix per-tensor observer issue with varying shape & rank" May 6, 2025
@Xia-Weiwen force-pushed the fix_per_tensor_quant branch from 87f1249 to 2ac41fb May 6, 2025 12:12
@@ -1891,6 +1891,10 @@ def convert(self, model: torch.fx.GraphModule, observer_node: Node):
assert self.original_dtype is not None, (
"Expecting original_dtype to be populated"
)
# Since input shape & rank may change (e.g. Resnet18), here we need to update block_size for each input
self.block_size = get_block_size(
@jerryzh168 (Contributor) May 6, 2025

when does this happen? can you give an example? I thought using -1 for dynamic dimensions would be enough?

@Xia-Weiwen (Collaborator, Author)

To reproduce the issue, you may run the code here: #2094 (comment)
You will have to use -1 for block_size without updating self.block_size here.

Contributor

are you saying the rank / number of dimensions changes for the input as well? can we use a single -1 to represent this case?

@Xia-Weiwen (Collaborator, Author)

> are you saying the rank / number of dimensions changes for the input as well?

Yes.

> can we use a single -1 to represent this case?

I think it's doable, but there are checks that guard len(self.block_size) == len(input.shape). We would need to handle the special case for per-tensor quant at those locations. Is that OK?
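As an editorial aside, a minimal sketch of the special-casing being discussed, assuming a bare (-1,) sentinel were used for per-tensor quantization (hypothetical; the PR instead stores -1 per dimension):

```python
# Hypothetical illustration only: if a single (-1,) sentinel denoted
# per-tensor quantization, every check of the form
# len(self.block_size) == len(input.shape) would need a carve-out like this.
def resolve_or_check(block_size, input_shape):
    if tuple(block_size) == (-1,):
        # Per-tensor sentinel: one block spans the whole tensor.
        return tuple(input_shape)
    assert len(block_size) == len(input_shape), (
        f"block_size {block_size} does not match input shape {input_shape}"
    )
    return tuple(block_size)


print(resolve_or_check((-1,), (1, 64, 56, 56)))           # (1, 64, 56, 56)
print(resolve_or_check((1, 64, 1, 1), (1, 64, 56, 56)))   # (1, 64, 1, 1)
```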

@Xia-Weiwen requested a review from jerryzh168 May 8, 2025 07:15
@Xia-Weiwen (Collaborator, Author)

@jerryzh168 Could you please review this PR? Thanks.

@Xia-Weiwen requested a review from drisspg May 13, 2025 01:28
@Xia-Weiwen marked this pull request as ready for review May 13, 2025 01:28
@Xia-Weiwen (Collaborator, Author)

Hi @jerryzh168 @drisspg, could you please review this PR? I am not sure whether the current implementation is what you expected. Thanks.

@@ -113,7 +113,8 @@ def _get_reduction_params(block_size, input_size):
shape_for_reduction: (3, 3, 5, 2, 10)
reduction_dim: [0, 1, 3, 4]
"""
assert len(block_size) == len(input_size)
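For reference, a simplified paraphrase of what _get_reduction_params computes (an editorial sketch based on the docstring output shown above, not the exact torchao implementation). The output above is consistent with block_size (3, 3, 2, 10) and input_size (3, 3, 10, 10): each dimension is split into (dim // block, block) and the block axes are reduced, which is why the assert requires block_size and input_size to have the same rank.

```python
# Simplified paraphrase of _get_reduction_params, based only on the docstring
# output above; the real torchao implementation may differ in details.
def reduction_params(block_size, input_size):
    assert len(block_size) == len(input_size)  # the guard under discussion
    shape_for_reduction, reduction_dims = [], []
    for blk, dim in zip(block_size, input_size):
        if blk == dim:
            # One block spans the whole dimension: keep it, reduce over it.
            shape_for_reduction.append(dim)
            reduction_dims.append(len(shape_for_reduction) - 1)
        elif blk == 1:
            # Every element is its own block: keep the dimension, no reduction.
            shape_for_reduction.append(dim)
        else:
            # Split into (dim // blk) blocks of size blk; reduce the block axis.
            shape_for_reduction.extend([dim // blk, blk])
            reduction_dims.append(len(shape_for_reduction) - 1)
    return shape_for_reduction, reduction_dims


# Matches the docstring output: block_size (3, 3, 2, 10), input (3, 3, 10, 10)
print(reduction_params((3, 3, 2, 10), (3, 3, 10, 10)))
# ([3, 3, 5, 2, 10], [0, 1, 3, 4])

# Per-tensor case: block_size == input_size, so every dimension is reduced.
print(reduction_params((1, 64, 56, 56), (1, 64, 56, 56)))
# ([1, 64, 56, 56], [0, 1, 2, 3])
```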
Contributor

is this still used? we should be using the code in quant_primitives.py I think

@Xia-Weiwen (Collaborator, Author)

Yes. It's still used when running the prepared model (model after prepare_pt2e). Is it a bug? Do I need to fix it, too?

@Xia-Weiwen (Collaborator, Author)

I am using the observers defined here: torchao/quantization/pt2e/_affine_quantization.py

@Xia-Weiwen (Collaborator, Author)

Hi @jerryzh168 May I know your suggestion on this? Thanks.

Contributor

I think we should be using the ones in torchao/quantization/observer.py eventually

The only occurrence seems to be AffineQuantizedMinMaxObserver, and we want to update it I think.

So if you are adding new things I'd recommend using the one from torchao.quantization.

Contributor

@Xia-Weiwen sorry for the delay, please feel free to work on this

Contributor

I thought we already use the one from torchao, but if you saw we are using torch.ao please go ahead and change them.

@Xia-Weiwen (Collaborator, Author)

@jerryzh168 Thanks for the reply. I did not mean torch.ao. I meant there are two versions of such utilities in torchao: torchao.quantization.pt2e and torchao.quantization. For example, there are both PartialWrapper and _PartialWrapper classes.
The PT2E flow in torchao uses the ones in torchao.quantization.pt2e, while you said you wanted to switch to torchao/quantization/observer.py.
So I was asking whether you would switch the PT2E flow to torchao/quantization/observer.py first. Do you have any suggestions on that? Thanks.

Contributor

oh I see, yeah for now using torchao/quantization/observer.py would be better I think, we haven't finalized the folder structure for this one yet

@Xia-Weiwen (Collaborator, Author)

@jerryzh168 Am I supposed to wait until you finalize the folder structure? Thanks.

@jerryzh168 (Contributor)

would be good to add a test for this

Labels: CLA Signed, topic: not user facing
Projects: None yet
Development

Successfully merging this pull request may close these issues.

[Quant][PT2E] AffineQuantized observers failed Resnet18
3 participants